Advanced Metrics for Class-Driven Similarity Search
Abstract
This paper presents two metrics for the Nearest Neighbor (NN) classifier that are both adapted, i.e. learned, from a set of data. Both metrics can be used for similarity search when retrieval critically depends on a symbolic target feature. The first, called Local Asymmetrically Weighted Similarity Metric (LASM), uses reinforcement learning techniques to compute asymmetric weights. Experiments on benchmark datasets show that LASM maintains good accuracy and achieves high compression rates, outperforming competing editing techniques such as Condensed Nearest Neighbor. The second metric, called Minimum Risk Metric (MRM), takes a completely different perspective and is based on probability estimates. MRM can be implemented with different probability estimates and performs comparably to the Bayes classifier built on the same estimates. Both LASM and MRM outperform the NN classifier with the Euclidean metric.
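As a rough illustration of how a learned metric plugs into the NN classifier, the following Python sketch shows a 1-NN rule with a swappable distance function. The abstract does not give the formula for MRM, so the probability-based distance here is an assumption: it scores a pair (x, y) by the chance that the two points belong to different classes, sum over c of p(c|x) * (1 - p(c|y)). The names `nn_classify`, `make_risk_metric` and `posterior` are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector, Vector], float]


def euclidean(x: Vector, y: Vector) -> float:
    """Baseline metric: ordinary Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nn_classify(query: Vector,
                examples: Sequence[Vector],
                labels: Sequence[str],
                metric: Metric = euclidean) -> str:
    """1-NN rule: return the label of the stored example closest to the query."""
    best = min(range(len(examples)), key=lambda i: metric(query, examples[i]))
    return labels[best]


def make_risk_metric(posterior: Callable[[Vector], Dict[str, float]]) -> Metric:
    """Build a probability-based distance from class-posterior estimates.

    ASSUMPTION (not stated in the abstract): the distance is the estimated
    probability that x and y have different classes,
    sum_c p(c|x) * (1 - p(c|y)). `posterior` is a hypothetical estimator
    mapping a point to a {class: probability} dictionary.
    """
    def risk(x: Vector, y: Vector) -> float:
        px, py = posterior(x), posterior(y)
        return sum(p * (1.0 - py.get(c, 0.0)) for c, p in px.items())
    return risk
```

Any posterior estimator can be passed to `make_risk_metric`, which mirrors the abstract's remark that MRM can be implemented with different probability estimates; the resulting function is then passed to `nn_classify` through its `metric` argument.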
Publication date: 1999